Smart Paper Notes

A Review of Published Machine Learning Natural Language Processing Applications for Protocolling Radiology Imaging

Nihal Raju , Michael Woodburn , Stefan Kachel , Jack O'Shaughnessy , Laurence Sorace , Natalie Yang , Ruth P Lim

Category: Computer Vision

2022-06-23

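Machine learning (ML) is a subfield of artificial intelligence (AI), and its applications in radiology are growing at an ever-accelerating rate. The most studied ML application is the automated interpretation of images. However, natural language processing (NLP), which can be combined with ML for text interpretation tasks, also has many potential applications in radiology. One such application is the automation of radiology protocolling, which involves interpreting a clinical radiology referral and selecting the appropriate imaging technique. This is an essential task that ensures the correct imaging is performed. However, the time a radiologist must dedicate to protocolling could otherwise be spent reporting, communicating with referrers, or teaching. To date, there have been few publications on ML models that use clinical text to automate protocol selection. This article reviews the existing literature in this field. The published models are systematically assessed with reference to the best practices suggested by machine learning convention. Progress towards implementing automated protocolling in a clinical setting is discussed.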

Autonomous Vehicle Navigation with LIDAR using Path Planning

Rahul M K , Sumukh B , Praveen L Uppunda , Vinayaka Raju , C Gururaj

Category: Robotics

2022-12-14

In this paper, a complete framework for autonomous self-driving is implemented. LIDAR, camera, and IMU sensors are used together. All data communication is managed by the Robot Operating System (ROS), which provides a robust platform for implementing robotics projects. A Jetson Nano provides on-board processing. Sensor fusion is performed on the data received from the different sensors to improve the accuracy of the decisions and inferences derived from the data. This data is then used to create a localized map of the environment, and the position of the vehicle is obtained with respect to that map. The SLAM techniques used for this purpose are Hector Mapping and GMapping, both widely used mapping techniques in ROS. Apart from SLAM, which primarily uses LIDAR data, visual odometry is implemented using a monocular camera. The sensor-fused data is then used by Adaptive Monte Carlo Localization to localize the car. Using the localized map, path planning techniques such as the "TEB planner" and the "Dynamic Window Approach" are implemented for autonomous navigation of the vehicle. The last step in the project is the implementation of control, the final decision-making block in the pipeline, which outputs speed and steering data for navigation that is compatible with Ackermann kinematics. Implementing such a control block under a ROS framework using the three sensors (LIDAR, camera, and IMU) is the novel approach undertaken in this project.
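As a rough illustration of what "compatible with Ackermann kinematics" can mean for the control block, the sketch below converts a planner-style velocity command (linear speed plus yaw rate) into a speed and front-wheel steering angle using the standard kinematic bicycle approximation. It is not the authors' controller: the function name `twist_to_ackermann`, the wheelbase value, and the small-value thresholds are assumptions chosen for the example.

```python
import math

def twist_to_ackermann(v, omega, wheelbase):
    """Convert (linear speed v, yaw rate omega) to (speed, steering angle).

    Kinematic bicycle approximation: omega = v * tan(delta) / wheelbase,
    so delta = atan(wheelbase * omega / v).
    """
    if abs(v) < 1e-6 or abs(omega) < 1e-9:
        # Straight-line motion, or no forward motion at all:
        # an Ackermann-steered car cannot rotate in place, so steer straight.
        return v, 0.0
    turning_radius = v / omega
    return v, math.atan(wheelbase / turning_radius)

# Hypothetical command: 1.0 m/s forward, 0.5 rad/s yaw rate, 0.3 m wheelbase.
speed, steering = twist_to_ackermann(v=1.0, omega=0.5, wheelbase=0.3)
```

Planners such as TEB or DWA usually emit a velocity pair like this, so a conversion of this kind is one plausible place where the Ackermann constraint enters the pipeline.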

Persona-Based Conversational AI: State of the Art and Challenges

Junfeng Liu , Christopher Symons , Ranga Raju Vatsavai

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning

2022-12-04

Conversational AI has become an increasingly prominent and practical application of machine learning. However, existing conversational AI techniques still suffer from various limitations. One such limitation is a lack of well-developed methods for incorporating auxiliary information that could help a model understand conversational context better. In this paper, we explore how persona-based information could help improve the quality of response generation in conversations. First, we provide a literature review focusing on the current state-of-the-art methods that utilize persona information. We evaluate two strong baseline methods, the Ranking Profile Memory Network and the Poly-Encoder, on the NeurIPS ConvAI2 benchmark dataset. Our analysis elucidates the importance of incorporating persona information into conversational systems. Additionally, our study highlights several limitations with current state-of-the-art methods and outlines challenges and future research directions for advancing personalized conversational AI technology.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Cementron: Machine Learning the Constituent Phases in Cement Clinker from Optical Images

Mohd Zaki , Siddhant Sharma , Sunil Kumar Gurjar , Raju Goyal , Jayadeva , N. M. Anoop Krishnan

Category: Computer Vision

2022-11-06

Cement is the most used construction material. The performance of cement hydrate depends, both qualitatively and quantitatively, on the constituent phases present in the cement clinker, viz. alite, belite, aluminate, and ferrites. Traditionally, clinker phases are analyzed from optical images, relying on a domain expert and simple image processing techniques. However, the non-uniformity of the images, variations in the geometry and size of the phases, and variability in the experimental approaches and imaging methods make it challenging to identify the phases. Here, we present a machine learning (ML) approach to detect clinker microstructure phases automatically. To this end, we create the first annotated dataset of cement clinker by segmenting alite and belite particles. Further, we use supervised ML methods to train models for identifying alite and belite regions. Specifically, we finetune the image detection and segmentation model Detectron-2 on the cement microstructure to develop a model for detecting the cement phases, namely, Cementron. We demonstrate that Cementron, trained only on literature data, works remarkably well on new images obtained from our experiments, indicating its generalizability. We make Cementron available for public use.

Applying Association Rules Mining to Investigate Pedestrian Fatal and Injury Crash Patterns Under Different Lighting Conditions

Ahmed Hossain , Xiaoduan Sun , Raju Thapa , Julius Codjoe

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-06

The pattern of pedestrian crashes varies greatly with lighting circumstances, emphasizing the need to examine pedestrian crashes under different lighting conditions. Using Louisiana pedestrian fatal and injury crash data (2010-2019), this study applied Association Rules Mining (ARM) to identify the hidden patterns of crash risk factors under three lighting conditions (daylight, dark-with-streetlight, and dark-no-streetlight). Based on the generated rules, the results show that daylight pedestrian crashes are associated with children (less than 15 years), senior pedestrians (greater than 64 years), older drivers (>64 years), and driving behaviors such as failure to yield, inattention/distraction, and illness/fatigue/falling asleep. Additionally, young drivers (15-24 years) are involved in severe pedestrian crashes in daylight conditions. This study also found pedestrian alcohol/drug involvement to be the most frequent item in the dark-with-streetlight condition. This crash type is particularly associated with pedestrian action (crossing intersection/midblock), driver age (55-64 years), speed limit (30-35 mph), and specific area type (business with mixed residential area). Fatal pedestrian crashes are found to be associated with roadways with high speed limits (>50 mph) in the dark-no-streetlight condition. Other risk factors linked with high-speed-limit crashes are pedestrians walking with/against the traffic, pedestrian dark clothing, and pedestrian alcohol/drug involvement. The research findings are expected to provide an improved understanding of the underlying relationships between pedestrian crash risk factors and specific lighting conditions. Highway safety experts can use these findings when selecting effective countermeasures to reduce pedestrian crashes strategically.
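For readers unfamiliar with association rules mining, the sketch below shows how the three standard rule metrics (support, confidence, and lift) are computed for a single rule. The crash records and item names are invented toy data, not the Louisiana dataset, and the helper names are hypothetical; the abstract does not say which ARM implementation the study used.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    if s_a == 0 or s_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = s_ac / s_a    # estimate of P(consequent | antecedent)
    lift = confidence / s_c    # > 1 means the two co-occur more often than by chance
    return s_ac, confidence, lift

# Toy crash records (hypothetical): each record is a set of categorical items.
crashes = [
    {"dark_no_streetlight", "speed_gt_50", "fatal"},
    {"dark_no_streetlight", "speed_gt_50", "fatal"},
    {"dark_no_streetlight", "speed_le_50", "injury"},
    {"daylight", "speed_le_50", "injury"},
    {"daylight", "speed_gt_50", "injury"},
]
metrics = rule_metrics(crashes, {"dark_no_streetlight", "speed_gt_50"}, {"fatal"})
```

A full ARM run would enumerate many candidate rules (for example with the Apriori algorithm) and keep those above chosen support, confidence, and lift thresholds; this sketch only scores one rule.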

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities

Pranav Dheram , Murugesan Ramakrishnan , Anirudh Raju , I-Fan Chen , Brian King , Katherine Powell , Melissa Saboowala , Karan Shetty , Andreas Stolcke

Category: Natural Language Processing

2022-07-22

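As with other forms of AI, speech recognition has recently been examined with respect to performance disparities across different user cohorts. One approach to achieving fairness in speech recognition is to (1) identify speaker cohorts that suffer from subpar performance and (2) apply fairness mitigation measures targeting the cohorts discovered. In this paper, we report initial findings on both the discovery and the mitigation of performance disparities, using data from a product-scale AI assistant speech recognition system. We compare cohort discovery based on geographic and demographic information to a more scalable method that groups speakers without human labels, using speaker embedding technology. For fairness mitigation, we find that oversampling underrepresented cohorts, as well as modeling speaker cohort membership through additional input variables, reduces the gap between top- and bottom-performing cohorts without degrading overall recognition accuracy.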
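The oversampling mitigation mentioned above can be pictured with a minimal sketch: utterances from smaller cohorts are resampled with replacement until every cohort matches the largest one. This is a generic balancing scheme under assumed names (`oversample_cohorts`, `cohort_of`) and toy data; the abstract does not specify the sampling ratios or procedure the authors used.

```python
import random

def oversample_cohorts(utterances, cohort_of, seed=0):
    """Resample smaller cohorts with replacement up to the largest cohort's size."""
    rng = random.Random(seed)
    by_cohort = {}
    for utt in utterances:
        by_cohort.setdefault(cohort_of(utt), []).append(utt)
    target = max(len(group) for group in by_cohort.values())
    balanced = []
    for group in by_cohort.values():
        balanced.extend(group)                                      # keep every original
        balanced.extend(rng.choices(group, k=target - len(group)))  # add random repeats
    return balanced

# Toy training set (hypothetical): cohort "A" has 6 utterances, cohort "B" has 2.
data = [("A", i) for i in range(6)] + [("B", i) for i in range(2)]
balanced = oversample_cohorts(data, cohort_of=lambda utt: utt[0])
```

In practice the cohort label could come either from metadata or from clustering speaker embeddings, which are the two discovery routes the abstract compares.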

ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale

Gopinath Chennupati , Milind Rao , Gurpreet Chadha , Aaron Eakin , Anirudh Raju , Gautam Tiwari , Anit Kumar Sahu , Ariya Rastrow , Jasha Droppo , Andy Oberlin

Categories: Natural Language Processing | Artificial Intelligence

2022-07-19

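Incremental learning is a paradigm that enables model building and updating at scale with streaming data. For end-to-end automatic speech recognition (ASR) tasks, the absence of human-annotated labels, together with the need for privacy-preserving policies for model building, makes this a daunting challenge. Motivated by these challenges, in this paper we use a cloud-based framework for production systems to present insights from privacy-preserving incremental learning for automatic speech recognition (ILASR). By privacy-preserving, we mean the use of ephemeral data that are not human-annotated. This system is a step forward for production-level ASR models for incremental/continual learning, offering a near-real-time test bed for end-to-end ASR experimentation in the cloud while adhering to privacy-preserving policies. We show that the proposed system can improve production models significantly (3%) over a new time period of six months, even in the absence of human-annotated labels, with weak supervision and large batch sizes in incremental learning. The improvement is 20% on test sets containing new words and phrases from the new time period. We demonstrate the effectiveness of building models in a privacy-preserving incremental fashion for ASR, while further exploring the utility of having an effective teacher model and of using large batch sizes.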

MultiViz: An Analysis Benchmark for Visualizing and Understanding Multimodal Models

Paul Pu Liang , Yiwei Lyu , Gunjan Chhablani , Nihal Jain , Zihao Deng , Xingbo Wang , Louis-Philippe Morency , Ruslan Salakhutdinov

Categories: Machine Learning | Artificial Intelligence | Natural Language Processing | Computer Vision

2022-06-30

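The promise of multimodal models for real-world applications has inspired research into visualizing and understanding their internal mechanics, with the end goal of empowering stakeholders to visualize model behavior, perform model debugging, and promote trust in machine learning models. However, modern multimodal models are typically black-box neural networks, which makes their internal mechanics challenging to understand. How can we visualize the internal modeling of multimodal interactions in these models? Our paper aims to fill this gap by proposing MultiViz, a method for analyzing the behavior of multimodal models by dividing the problem of interpretability into 4 stages: (1) unimodal importance: how each modality contributes to downstream modeling and prediction, (2) cross-modal interactions: how different modalities relate to each other, (3) multimodal representations: how unimodal and cross-modal interactions are represented in decision-level features, and (4) multimodal prediction: how decision-level features are composed to make a prediction. MultiViz is designed to operate across diverse modalities, models, tasks, and research areas. Through experiments on 8 trained models across 6 real-world tasks, we show that the complementary stages in MultiViz together enable users to (1) simulate model predictions, (2) assign interpretable concepts to features, (3) perform error analysis on model misclassifications, and (4) use insights from error analysis to debug models. MultiViz is publicly available, will be regularly updated with new interpretation tools and metrics, and welcomes input from the community.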

LRH-Net: A Multi-Level Knowledge Distillation Approach for Low-Resource Heart Network

Ekansh Chauhan , Swathi Guptha , Likith Reddy , Bapi Raju

Categories: Artificial Intelligence | Machine Learning

2022-04-11

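An electrocardiogram (ECG) monitors the electrical activity generated by the heart and is used to detect fatal cardiovascular diseases (CVDs). Conventionally, to capture precise electrical activity, clinical experts use multiple-lead ECGs (typically 12 leads). Recently, large deep learning models have been used to detect these diseases. However, such models require heavy computational resources, such as large amounts of memory and long inference times. To alleviate these shortcomings, we propose a low-parameter model, named Low Resource Heart-Network (LRH-Net), which uses fewer leads to detect ECG anomalies in a resource-constrained environment. In addition, a multi-level knowledge distillation process is used to obtain better generalization performance from the proposed model. This process distills knowledge from higher-parameter (teacher) models trained on multiple leads into LRH-Net, which is trained on a reduced number of leads, in order to narrow the performance gap. The proposed model is evaluated on the PhysioNet-2020 challenge dataset with constrained input. LRH-Net has 106 times fewer parameters than the teacher model for detecting CVDs. Compared with the teacher model, LRH-Net's performance was scaled up by as much as 3.2% and its inference time was reduced by 75%. In contrast to compute- and parameter-intensive deep learning techniques, the proposed method uses a subset of ECG leads with the low-resource LRH-Net, making it well suited for deployment on edge devices.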
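To make the distillation idea concrete, here is a minimal sketch of a generic single-teacher knowledge distillation loss: a temperature-softened KL term that pulls the student towards the teacher, blended with ordinary cross-entropy on the true label. It is not the paper's multi-level procedure; the function names, the temperature, the weighting `alpha`, and the logits are all assumptions made for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Soft-target term: how far the student's softened distribution is from the teacher's.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-target term: standard cross-entropy against the ground-truth class.
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# Hypothetical 3-class logits for one ECG recording whose true class is index 2.
loss = distillation_loss([0.2, 0.5, 1.5], [0.1, 0.3, 3.0], label=2)
```

The `temperature ** 2` factor is the usual rescaling that keeps the soft-target gradients comparable in magnitude to the hard-label term. A multi-level scheme would presumably apply a loss of this kind more than once across intermediate models, but that structure is not described in the abstract.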
